Update Makefile to support MoE #1446
Conversation
In reference to ggerganov/llama.cpp#4406: a newer version of llama.cpp is needed to handle MoE models, such as Mixtral 8x7b.

Signed-off-by: Samuel Walker <[email protected]>
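The change itself is small: LocalAI pins its llama.cpp binding to a specific commit in the Makefile, so supporting MoE models mostly means bumping that pin. A minimal sketch of the kind of edit involved (the variable name and placeholder hash are illustrative, not the actual diff; check the repository's Makefile):

```makefile
# Bump the pinned go-llama.cpp / llama.cpp revision to one that
# includes MoE support (ggerganov/llama.cpp#4406).
# Variable name is illustrative; the real Makefile may differ.
GOLLAMA_VERSION?=<commit-hash-with-moe-support>
```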
Currently building and testing locally to confirm it works. Using:

```yaml
# backend: llama
context_size: 8192
f16: true
low_vram: false
gpu_layers: 98
mmlock: false
name: mixtral
parameters:
  model: mixtral-8x7b-v0.1.Q4_K_M.gguf
  temperature: 0.2
```
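Once a model definition like the one above is in place, a quick local test can go through LocalAI's OpenAI-compatible chat completions endpoint. A minimal sketch in Python (the host, port, and prompt are assumptions; only building the request is shown, since actually sending it needs a running LocalAI instance):

```python
import json
import urllib.request

# Build an OpenAI-style chat completion request for the model named
# "mixtral" in the YAML config above. The URL assumes LocalAI's
# default address; adjust host/port for your deployment.
payload = {
    "model": "mixtral",
    "messages": [{"role": "user", "content": "Say hello."}],
    "temperature": 0.2,
}
body = json.dumps(payload).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send it; omitted here because it
# requires a running LocalAI instance with the model loaded.
```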
Hmm, that doesn't appear to be where to change it.
Maybe gollama needs to be updated as well?
Tried to reflect the upstream update attempt go-skynet/go-llama.cpp#313, but it failed.
Seems to be the same issue in their CI: https://github.com/go-skynet/go-llama.cpp/actions/runs/7212854485/job/19651445657?pr=313
Trying against go-skynet/go-llama.cpp#315
Hmm.
Did you try with the `llama` backend?
I tried with `backend: llama` both commented and uncommented. No success. The last attempt was with it uncommented.
Did you try without acceleration too?
Didn't try this model without acceleration, but I did try another model with acceleration and it worked just fine.
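For reference, ruling out acceleration in this setup amounts to offloading no layers to the GPU in the model's YAML definition; a sketch of the tweak (field name taken from the config quoted earlier in this thread):

```yaml
# Disable GPU offloading to test the model CPU-only.
gpu_layers: 0
```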
Maybe it is an upstream issue; the llama-cpp backend is the closest one to upstream, so if that fails something might be off with llama.cpp. master just got the latest hash in #1429; I'll try to give it a go later today too.
I'll give that a try today.
Tried today and it works locally; adding a full example in #1449.
@sfxworks appreciate the effort here, but I think we can close this one as we have more up-to-date hashes in master. Or is there anything pending? Did you check whether Mixtral works for you?
Yep! All works. I appreciate it!
Description
This PR fixes #1421
Notes for Reviewers
Signed commits